The Annals of Probability 

2004, Vol. 32, No. 3, 2149-2178 

DOI: 10.1214/009117904000000081 

© Institute of Mathematical Statistics, 2004

MORE RIGOROUS RESULTS ON THE KAUFFMAN LEVIN 

The purpose of this note is to provide proofs for some facts about
the NK model of evolution proposed by Kauffman and Levin. In the 
case of normally distributed fitness summands, some of these facts 
have been previously conjectured and heuristics given. In particular, 
we provide rigorous asymptotic estimates for the number of local 
fitness maxima in the case when K is unbounded. We also examine 
the role of the individual fitness distribution and find the model to 
be quite robust with respect to this. 

1. Introduction. The purpose of this note is to provide proofs for some 
facts about the NK model. Some of these proofs have been previously formu- 
lated, at least approximately, as conjectures or heuristic arguments. Since 
we are interested in the mathematical analysis of the model, we include only 
a brief summary of the biological motivation, for which we can do no better 
than to excerpt and paraphrase from the introductory section of the paper 
by Evans and Steinsaltz [3]. 

Beginning with Sewall Wright in the early twentieth century, evolution 
has been modeled as the gradual motion of a genome through an abstract 
space, with a tendency toward increasing values of the fitness function. One 
may think of the graph of this function as a fitness landscape and of natu- 
ral selection as a random walk with upward drift on the fitness landscape. 
One cannot understand the likely behavior of such a random walk without 
understanding the quantitative nature of the landscape as one with "slivers 
of high fitness looming up above the vast genomic tohubohu" [3]. In any 
random walk model of fitness landscapes and natural selection, the nature
of the global fitness maximum is less important than the number and height 
of local maxima. 

Kauffman and Levin [7] introduced the NK model, which is a probabilistic
model for the fitness landscape. In this model, there are N loci, at each
of which is one of two possible alleles. Thus a genome is an element of the
space $\{0,1\}^N$. The fitness of a genome is the sum of N different fitnesses,
the jth of which is determined by the alleles at sites $j, j+1, \ldots, j+K$
modulo N. In the NK model, the $2^{K+1}$ alleles in the N possible positions
are given fitnesses whose joint distribution is that of $2^{K+1} N$ i.i.d.
picks from a distribution F. The fitness of a given genome is then the sum of
the N fitnesses corresponding to the actual string of K + 1 alleles beginning at each
position. Note that this randomness is present in the model at the start; 
later one may model natural selection as a random walk in this random en- 
vironment, but that is beyond the scope of this paper. Evans and Steinsaltz 
pointed out that since the allele substrings of length K + 1 overlap, there 
is no easy way to find the optimal choice for the N alleles. They concluded 
that "while no one would mistake this abstract system for a realistic model 
of genetic evolution, it has the virtues of a good foundational model: it is 
easy to describe, yet contains a wealth of structure that is neither obvious 
nor superficially accessible. Before we can analyze a more realistic model, it 
would seem we must first come to grips with models such as this one. At 
the same time, we may hope that some general features of this model will 
carry over to something like the real world." 

Most studies of the NK model rely on simulations, which are limited to 
small to intermediate values of N (e.g., in [6], N = 96 and in [2], N = 
1024, which corresponds to the size of a gene, but it is much smaller than 
the number of genes in a genome). Simulations may provide quick answers 
to various questions in particular cases of fitness distribution F. However, 
a very interesting and natural question of robustness of the model under 
variations in F can be tackled only mathematically. 

We warn the reader that we always assume in this paper that the pa- 
rameter K is strictly positive and that the underlying distribution F is 
continuous. The NK model for K = 0 or K = N - 1 exhibits special
behaviors which were rigorously analyzed by many authors (see, e.g., [7]). If
F were not continuous, ties would be possible and analysis would become 
cumbersome. 

The study of the question to which our paper is devoted begins with [11], 
where Weinberger gives asymptotic formulae for the number of local fitness 
maxima (LFM) when N and K are large and F is the normal distribution. 
As noted in [2] and [3], however, Weinberger's derivation is not rigorous. 
Weinberger's heuristics are limited to the case where F is the normal distri- 
bution, although he points out that other distributions such as the Cauchy 
might be more realistic and that one could expect the outcome to be inde- 
pendent of the choice of distribution. 

The majority of rigorous results that have been obtained assume that K
is fixed and $N \to \infty$. In this context, several results were obtained in
two recent papers [2, 3]. Among other things, they both show ([3], Theorem 7,
and [2], Theorem 2.1) that the exponential growth rate of the number of local
maxima (or, equivalently, the exponential decay rate of the probability of a
given genome being a local fitness maximum) exists as a limit. In other words,
the probability of a LFM decays like $\exp(N(\lambda_K + o(1)))$ as
$N \to \infty$ with K remaining fixed. For K = 1, they computed this limit
explicitly when F is the exponential distribution [3] or the negative
exponential [2]. In the case where F has an exponential moment, Durrett and
Limic ([2], Theorem 5.1) made partial progress toward showing the number of
local maxima (for large K, N) to be independent of the distribution F: they
bounded the exponential rate on one side and they conjectured this to be
correct to within a constant factor. The value of $\lambda_K$ is theoretically
possible to compute for certain distributions when K > 1, but practically
impossible. It is biologically reasonable that K be on the order of at least
several dozens, whence our interest in asymptotic formulae for $\lambda_K$
with error estimates that are valid as $K, N \to \infty$ without restriction.
For example, in [6], pages 122-142, it is shown that maturation of the immune
response fits the parameters K = 40 and N = 122, which is probably best
described as "N and K large, with N/K remaining bounded."

The first purpose of this note is to rigorize Weinberger's computations for 
the normal case. This includes sharpening his statements to include error 
bounds and quantified asymptotic statements, specifically convergence uniform
in N as $K \to \infty$. The second purpose is to investigate dependence on
F. Specifically, we prove some asymptotic results that do not depend at all 
on the distribution of F, completing and generalizing the conjecture in [2], 
and we show some stronger results for the "fat-tail" case, which we believe 
to be the extreme opposite to the case where F has finite second moment. 

The remainder of the paper is organized as follows. The next section sets 
forth the notation and states our main results. Section 3 gives proofs for 
the results in which F is the normal distribution. Section 4 proves results 
for general distributions and derives asymptotics for fat-tailed distributions 
when $N/K \to \infty$. Section 5 contains a detailed analysis of the case where
F has fat tails and N/K remains bounded. Finally, Section 6 gives an ex- 
act expression for the exponential rate when F is the fat tail and K = 1, 
which, when compared with similar computations for other distributions, 
corroborates an extremality conjecture for the fat tail. 

We use the notation $o(1)$ to represent a term that converges to 0 as
$K \to \infty$, $O(1)$ to represent a term bounded by a constant and
$\Theta(\mathrm{expression}(K))$ to represent a term for which there are
positive finite constants $c, C$ (independent of K) such that
$c \cdot \mathrm{expression}(K) < \mathrm{term} < C \cdot \mathrm{expression}(K)$.

2. Notation and statements of results. The parameters of the model are
positive integers N > K and a continuous distribution function F on the
real numbers. Our concern in this paper is with the number of LFMs for
a random fitness landscape. The expectation of this number is equal to $2^N$
times the probability that any given genome is a local fitness maximum. 
Consequently, our sole focus is the rigorous estimation of this probability. 
Showing that the logarithm of the number of LFMs is near its expectation 
is not hard, but will not concern us here; see, for example, [2], Theorem 7.1, 
where an asymptotic normality result is obtained for the logarithm of the 
number of local fitness maxima. 

In the NK model the (unnormalized) fitness of a particular genome
$\eta = (\eta_1, \eta_2, \ldots, \eta_N) \in \{0,1\}^N$ is defined to be

\[ \sum_{j=1}^{N} Y(j; (\eta_j, \eta_{j+1}, \ldots, \eta_{j+K})), \tag{2.1} \]

where the family

\[ \{ Y(j; (\eta_1, \eta_2, \ldots, \eta_{K+1})) : j = 1, \ldots, N;\ (\eta_1, \eta_2, \ldots, \eta_{K+1}) \in \{0,1\}^{K+1} \} \]

is a family of $N \cdot 2^{K+1}$ i.i.d. random variables with common
distribution F. Suppose we are given such a family on a probability space
$(\Omega, \mathbf{F}, \mathbf{P})$ and abbreviate

\[ Y_j := Y(j; (0, 0, \ldots, 0)) \]

to be the fitness of the substring of K + 1 zeros starting in position j; here
and throughout, arithmetic on subscripts is always taken modulo N. With
the above notation the fitness of the zero genome is $\sum_{j=1}^{N} Y_j$.
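
As a concrete illustration of definition (2.1), the following is a minimal Python sketch of one random NK landscape. The helper names (`make_nk_fitness`, `is_local_max`, `count_local_maxima`) and the lazy filling of the fitness table are assumptions made for this example only; since the table entries are i.i.d., drawing each value the first time it is needed gives the same joint law as drawing all of them up front. Positions are 0-indexed here, and F is taken to be standard normal in the brute-force count.

```python
import itertools
import random


def make_nk_fitness(N, K, draw, seed=0):
    """Return a function genome -> fitness for one random NK landscape.

    draw(rng) must return one sample from the fitness distribution F.
    """
    rng = random.Random(seed)
    table = {}  # (j, window) -> Y(j; window), filled on first use

    def fitness(genome):
        total = 0.0
        for j in range(N):
            # alleles at sites j, j+1, ..., j+K (indices modulo N)
            window = tuple(genome[(j + m) % N] for m in range(K + 1))
            if (j, window) not in table:
                table[(j, window)] = draw(rng)
            total += table[(j, window)]
        return total

    return fitness


def is_local_max(fitness, genome):
    """True if every single-site flip has strictly smaller fitness."""
    base = fitness(genome)
    for j in range(len(genome)):
        neighbour = list(genome)
        neighbour[j] ^= 1
        if fitness(neighbour) >= base:
            return False
    return True


def count_local_maxima(N, K, seed=0):
    """Brute-force count of LFMs with standard normal F (small N only)."""
    f = make_nk_fitness(N, K, lambda rng: rng.gauss(0.0, 1.0), seed)
    return sum(is_local_max(f, g) for g in itertools.product((0, 1), repeat=N))
```

The brute-force count enumerates all $2^N$ genomes, so it is only practical for small N; it is meant to make the definitions tangible, not to replace the asymptotic analysis.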

The genome consisting of all 0's has N neighbors, namely all binary
strings of length N with exactly one 1. Since in this paper we are only
interested in the probability of the event that the string of all 0's is a LFM,
the only other relevant random variables from the above family are the
fitnesses $Y(j; (\eta_1, \eta_2, \ldots, \eta_{K+1}))$, where $j = 1, \ldots, N$
and where $\sum_i \eta_i = 1$. We again abbreviate, for $1 \le j \le N$,
$0 \le i \le K$,

\[ Y_{j,i} := Y(j - i; (0, \ldots, 1, \ldots, 0)), \]

where 1 is only in the ith position above (here we count positions starting
from 0). The quantity $Y_{j,i}$ is interpreted as the fitness of the substring
of length K + 1 starting at position j - i that is all 0's except for a single
1 in position j. Then the definition (2.1) says that the string $e_j$
consisting of N - 1 0's and a single 1 in the jth position has fitness (in the
new notation)

\[ \sum_{i=j+1}^{j-K-1} Y_i + Y_{j,0} + Y_{j,1} + \cdots + Y_{j,K}. \]

The zero genome is a LFM if it has greater fitness than that of any genome
with exactly one 1. We denote the event of optimality of the zero string by
$\mathcal{H}$. We may write $\mathcal{H} = \bigcap_j \mathcal{H}_j$, where
$\mathcal{H}_j$ is the event that the string of all 0's is fitter than $e_j$.
Equivalently,

\[ \mathcal{H}_j \iff \sum_{i=j-K}^{j} Y_i > \sum_{i=0}^{K} Y_{j,i}. \tag{2.2} \]

Define $p_F(N,K) := \mathbf{P}(\mathcal{H})$. We usually suppress dependence on
F and write simply $p(N,K)$. Our first result makes rigorous and precise what
is stated in [11].

Theorem 2.1. Suppose that F is the standard normal distribution. Then

\[ \log p(N,K) = \frac{N}{K} \left( -\log K + R_{N,K} \right) \]

with

\[ c \log\log K \ge R_{N,K} \ge -c \sqrt{\log K} \]

for some c > 0.



Remarks. (i) Specializing to the case $N/K \to \alpha$, we obtain the estimate
$p(N,K) = K^{-\alpha + o(1)}$. (ii) The error terms are independent of N, so
the previous estimate is uniform in $N \ge K + 1$ as $K \to \infty$; here and
throughout, all asymptotic notation is with respect to K only (unless
otherwise noted). (iii) In contrast to what will be the case with other
distributions, there is no correction when N/K does not go to infinity.
(iv) If K = N - 1, then the NK model is essentially different from the NK
model where K < N - 1, but since $p(N, N-1) = 1/(N+1)$, it is still true that
$\log p(N, N-1) = -\log(N+1) \sim -N \log(N-1)/(N-1)$ with error smaller than
the above bounds on $R_{N,N-1}$ for large N.
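
For small parameters, the probability $p(N,K)$ in the normal case can be estimated directly from the characterization (2.2). The sketch below is a plain Monte Carlo estimate, not the method of this paper; the function name `estimate_p_normal` and its parameters are assumptions for illustration. It uses the fact that, when F is standard normal, each neighbor sum $\sum_{i=0}^{K} Y_{j,i}$ is an independent normal of variance K + 1.

```python
import random


def estimate_p_normal(N, K, trials=20000, seed=1):
    """Monte Carlo estimate of p(N, K) for standard normal F, 1 <= K <= N - 1."""
    rng = random.Random(seed)
    scale = (K + 1) ** 0.5
    hits = 0
    for _ in range(trials):
        y = [rng.gauss(0.0, 1.0) for _ in range(N)]  # Y_1, ..., Y_N
        ok = True
        for j in range(N):
            # left side of (2.2): Y_{j-K} + ... + Y_j, indices modulo N
            window = sum(y[(j - i) % N] for i in range(K + 1))
            # right side of (2.2): a sum of K+1 i.i.d. N(0,1), i.e. N(0, K+1)
            neighbour = scale * rng.gauss(0.0, 1.0)
            if window <= neighbour:
                ok = False
                break
        hits += ok
    return hits / trials
```

A convenient sanity check is the case K = N - 1 of Remark (iv), where the exact value is $1/(N+1)$; for instance `estimate_p_normal(4, 3)` should be close to 0.2, up to sampling error.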



Next, we state our most general result. 



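Theorem 2.2. Let F be any distribution and $N \ge 2(K+1)$. Then

\[ \log p(N,K) \le -(1 + o(1)) \left\lfloor \frac{N}{K} + o(1) \right\rfloor \log K, \tag{2.3} \]

\[ \log p(N,K) \ge -(3 + o(1)) \left\lceil \frac{N}{K} \right\rceil \log K. \tag{2.4} \]
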



We believe that the upper bound (2.3) is sharp, so we make the following 
conjecture: 

Conjecture 1. It is possible to replace 3 by 1 in (2.4).

When sums of random variables are concerned, the class of most tightly
clustered distributions comprises the distributions with finite variance, since
these exhibit Gaussian behavior when summed. At the other extreme, one
has distributions with extremely fat tails. In the limit, one might consider a
distribution with the following property: In any collection of n i.i.d. picks,
the greatest is much greater than the sum of the magnitudes of the others
with probability tending exponentially rapidly to 1 as $n \to \infty$. For
example, if U is uniform on [0, 1], then $\exp(\exp(1/U))$ has this property.
In this case, as long as $K \to \infty$ at least as fast as $\log N$, one may
approximate $\mathcal{H}_j$ by the event

\[ \mathcal{H}_j' := \Big\{ \max_{j-K \le i \le j} Y_i > \max_{0 \le i \le K} Y_{j,i} \Big\}. \tag{2.5} \]

Heuristically, properties of $p(N,K)$ shared by fat-tailed distributions and
normal distributions would be likely to hold for all distributions, since all
others lie in between. One approach to establishing facts about fat-tailed
distributions would be to axiomatize how fast the probability should tend
to 1 of the event that the largest of n picks dominates all the others, and
then prove theorems about distributions satisfying the axiom. We choose a
less cumbersome approach, namely to provide an analysis of the probability
of the event $\mathcal{H}' := \bigcap_{j=1}^{N} \mathcal{H}_j'$. We use the
notation $p_{\mathrm{fat}}(N,K)$ to denote $\mathbf{P}(\mathcal{H}')$ and
sometimes call it "$p(N,K)$ under the fat-tail distribution." Note that
$p_{\mathrm{fat}}(N,K)$ is independent of F, assuming F is continuous.

Conjecture 2. For any N and K, the infimum over all F of $p_F(N,K)$
is equal to $p_{\mathrm{fat}}(N,K)$.

Our next result shows that Conjecture 1 holds for the fat tail and thus
that Conjecture 2 implies Conjecture 1.

Theorem 2.3. We have

\[ \log p_{\mathrm{fat}}(N,K) \ge -(1 + o(1)) \left\lceil \frac{N}{K} + o(1) \right\rceil \log K. \]

Weinberger suggested the Cauchy as a biologically realistic distribution.
Those readers who are bothered by a mythological distribution called the
fat tail will perhaps be interested to see that the previous result for the fat
tail may be proved for the Cauchy. We remark that the criterion we have
suggested for axiomatization of the fat tail, namely exponential decay of
the probability that the largest of n picks fails to dominate the sum of the
others, requires much fatter tails than the Cauchy distribution possesses.
Thus we view the following result as more than adequate to demonstrate
that the fat-tail results hold for typical fat-tailed distributions.

Theorem 2.4. When F is a symmetric Cauchy distribution,

\[ \log p(N,K) \ge -(1 + o(1)) \left\lceil \frac{N}{K} + o(1) \right\rceil \log K. \]

Comparing these last results to Theorem 2.1, we see that for the fat-tail
and Cauchy distributions, and conjecturally for all distributions F,
$\log p_F(N,K) \sim \log p_\Phi(N,K)$, where $\Phi$ is the normal c.d.f., as
long as $N/K \to \infty$: in this case the difference between $N/K$ and
$\lceil N/K + o(1) \rceil$ is irrelevant and the formulae agree. Note that,
on the other hand, if $N/K \sim \alpha$, where $\alpha = m - 0.5$ for some
integer m, the difference between $\lceil N/K + o(1) \rceil$ and
$\lfloor N/K + o(1) \rfloor$ is 1, which amounts to a factor of $1/K$ between
the asymptotic lower and upper bounds for $p(N,K)$. It turns out there is, in
fact, an asymptotic inequivalence between $\log p_\Phi(N,K)$ and
$\log p_{\mathrm{fat}}(N,K)$ when N/K does not go to infinity. Because of
this, we include a more precise description of the asymptotics of
$\log p(N,K)$ in this regime.

The statement of the following theorem makes more sense if one keeps
in mind how $\mathcal{H}'$ is likely to occur. There will be at least
$r_0 := \lceil N/K \rceil$ large fitnesses among the $Y_j$, which is the
minimum number for which it is possible to have a large fitness in every
window of size K. The number of ways to pick r large fitnesses increases
with r, but the probability that any specific r fitness values are all large
decreases with r. In this energy-entropy tradeoff, the maximum occurs at
$r = r_0$ as N/K increases to $r_0 - o(1)$, at which point the r value that
achieves the maximum switches to $r_0 + 1$.

Theorem 2.5. As $K \to \infty$ with N/K bounded, there are formulae that
give the value of $p_{\mathrm{fat}}(N,K)$ up to a factor of $1 + o(1)$. The
formulae are in terms of functions $\{f_r : r \ge 3\}$ on $\mathbb{R}^+$, which
are defined by formula (5.7) in Section 5 and summarized in Table 1.
Additionally, the functions $f_r$ satisfy the following statements:

• $f_r(0) = 0$.

• $f_r(x) \sim x^{r-1}$ as $x \to 0$.

• For $r \ge 4$, $f_r$ is increasing, continuous and bounded on [0, 1].

• For r = 3, $f_r$ is increasing and continuous on [0, 1), with
$f_3(1 - t) \sim 2\log(1/t)$ as $t \to 0^+$.

In other words, there are narrow windows in the parameter N/K in which
$p_{\mathrm{fat}}(N,K)$ changes from roughly $K^{-r}$ to $K^{-(r+1)}$. These
windows occur at $N/K \sim r - K^{-1/(r-1)}$. An exception is when r = 2. In
this case, the change from order $K^{-2}$ to order $K^{-3}\log K$ is complete
at $N = 2K - c\log K$, after which the order slowly slides down to $K^{-3}$ as
$\log(N - 2K)$ increases to $\log K$.

Table 1
Behavior of $p(N,K)$ across integer values of $N/K$

N                  Range              $(1 + o(1))\, p_{\mathrm{fat}}(N,K)$
$2(K+1) - j$       $0 \le j \le K$    [entry illegible]
$(r - y)(K+1)$     $0 < y < 1$        [entry illegible]




A final result is the analysis for the fat tail when K = 1. Note that when 
K = O(1), maxima are taken over collections of a bounded size, so no actual
distribution has tails fat enough to ensure that the maximum dwarfs the 
others. Nevertheless, this result is still relevant to Conjecture 2. 

Theorem 2.6. We have 

\[ N^{-1} \log p_{\mathrm{fat}}(N, 1) \to z := -\log 1.803\ldots = -0.58947 \]
where z is the solution of the Bessel equation 

= 7r\/6BesseII(§, § v 7 ^) - 7iV3iBesselI(-|, fv 7 ^) 
+ 3A/2BesselK(f , \^PTz) + 3x/iBesselK(|, fv^z). 

The published exact values of the limit of $N^{-1} \log p(N, 1)$ for the exponential and negative
exponential are, respectively, -0.57504... [3] and -0.5499934... [2]. The 
published lower bound for the uniform is —0.55957. . . [2]. All of these val- 
ues are greater than the value for the fat tail given by Theorem 2.6, thus 
providing further corroboration of Conjecture 2. 

Some final notation and methodology common to all the proofs is as
follows. We let $\mathcal{F} = \sigma(Y_j : 1 \le j \le N)$ be the
$\sigma$-field generated by the fitnesses of zero substrings. We let
$F^{(K+1)}$ denote the c.d.f. for the sum of K + 1 independent picks from the
distribution F. Conditional on $\mathcal{F}$, the events $\mathcal{H}_j$ are
independent, with

\[ \mathbf{P}(\mathcal{H}_j \mid \mathcal{F}) = F^{(K+1)}\Big( \sum_{i=j}^{j+K} Y_i \Big). \]

Removing the conditioning then gives a formula which appears as [11], (2.4),

\[ p(N,K) = \int \prod_{j=1}^{N} F^{(K+1)}\Big( \sum_{i=j}^{j+K} Y_i \Big) \, dF(Y_1) \cdots dF(Y_N). \tag{2.6} \]



3. Analysis of the normal case. The following facts are well known. 

Lemma 3.1. If $\Phi$ and $\phi$ are the normal c.d.f. and density,
respectively, then

\[ \log \Phi(x) = (-1 + o(1))(1 - \Phi(x)) = -\phi(x)\,(x^{-1} + O(x^{-2})), \qquad x \to \infty, \tag{3.1} \]



(3.2) (log$ )» = 2^<„ 
and 

(3.3) the function $\log \Phi$ is concave.

Next we define the normalized total fitness

\[ t := N^{-1/2} \sum_{j=1}^{N} Y_j \]

and the recentered window sums

\[ X_j := \frac{\big( \sum_{i=j}^{j+K} Y_i \big) - \big( (K+1)/\sqrt{N} \big)\, t}{\sqrt{K+1}}. \]

It is immediate to verify that each $X_j$ is a normal with mean 0 and
variance $1 - (K+1)/N$. Since the quantities $Y_j - t/\sqrt{N}$ are
independent normals recentered to sum to zero, their joint distribution is
independent of the centering constant t. This can be verified explicitly by
checking that the covariance of t and $Y_j - t/\sqrt{N}$ is 0 for each j.
Consequently, since $\sqrt{K+1}\, X_j = \sum_{i=j}^{j+K} (Y_i - t/\sqrt{N})$,
we see that

(3.4) $\{X_j : 1 \le j \le N\}$ is independent of t.

Plugging this into (2.6) and using the fact that $F^{(K+1)}$ is a normal of
variance K + 1, we get

\[ p(N,K) = \mathbf{E} \prod_{j=1}^{N} \Phi\Big( X_j + t \sqrt{\tfrac{K+1}{N}} \Big). \tag{3.5} \]

Up to here we have followed Weinberger, arriving at [11], (3.2). Weinberger
now asserts that $X_j = O(1)$ with mean zero, and may therefore be removed
from the equation, resulting in
$p(N,K) \sim \mathbf{E}\, \Phi(t\sqrt{(K+1)/N})^N$, where t is a
standard normal; this is then evaluated by steepest descent. Our contribution
in the rest of this section is to finish this properly, with one inequality
(the upper bound on R) following directly from (3.3) of Lemma 3.1, rather
than relying on independence of t and $\{X_j : 1 \le j \le N\}$.

Upper bound on R. By definition, the random variables $X_j$ sum to zero.
Using concavity of $\log \Phi$, we have the (deterministic) inequality

\[ \sum_{j=1}^{N} \log \Phi\Big( X_j + t\sqrt{\tfrac{K+1}{N}} \Big) \le N \log \Phi\Big( t \sqrt{\tfrac{K+1}{N}} \Big). \]

Plugging into (3.5) then gives

\[ p(N,K) \le \mathbf{E}\, \Phi\Big( t\sqrt{\tfrac{K+1}{N}} \Big)^{N} = \int_{-\infty}^{\infty} \Phi\Big( x \sqrt{\tfrac{K+1}{N}} \Big)^{N} \phi(x)\, dx, \tag{3.6} \]

where $\phi$ is the normal density. Let $I(x) = I_{N,K}(x)$ denote the
integrand in (3.6) and let M denote the maximum value of $\log I$:

\[ M := \max_x \log I(x) = -\log\sqrt{2\pi} + \max_x \Big[ N \log \Phi\Big( x\sqrt{\tfrac{K+1}{N}} \Big) - \frac{x^2}{2} \Big]. \]

If we can show that

\[ \log \int I_{N,K}(x)\, dx \le M + O(1) \tag{3.7} \]

and that

\[ M = -\frac{N}{K} \big( \log K + O(\log\log K) \big), \tag{3.8} \]




then the first inequality in Theorem 2.1 will be proved. Both computations 
are routine, and we need only one inequality of (3.8), but we include the 
arguments because they clarify matters by indicating the location of the 
saddle. 
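
The right-hand side of (3.6) is a one-dimensional integral, so for moderate N and K it can be evaluated numerically and compared with the leading term $-(N/K)\log K$. The following sketch does this with the trapezoid rule in log space; the function names, the step size and the integration range are assumptions chosen for this illustration, and the code is a numerical check rather than part of the proof. It returns both the logarithm of the integral and the largest value of $\log I$ found on the grid, which approximates M.

```python
import math


def log_Phi(z):
    """Logarithm of the standard normal c.d.f., stable in both tails."""
    if z < 0:
        return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
    return math.log1p(-0.5 * math.erfc(z / math.sqrt(2.0)))


def log_upper_bound(N, K, step=0.005):
    """Return (log of the integral in (3.6), grid maximum of log I)."""
    s = math.sqrt((K + 1) / N)
    # x0 is the point used in the text to locate the saddle
    x0 = math.sqrt(2.0 * N / (K + 1) * math.log(K + 1))
    lo, hi = -12.0, x0 + 12.0
    n = int((hi - lo) / step)
    logs = []
    for k in range(n + 1):
        x = lo + k * step
        # log I(x) = N log Phi(x s) - x^2/2 - log sqrt(2 pi)
        logs.append(N * log_Phi(x * s) - 0.5 * x * x
                    - 0.5 * math.log(2.0 * math.pi))
    m = max(logs)
    total = sum(math.exp(v - m) for v in logs)
    # trapezoid rule: the two end points carry half weight
    total -= 0.5 * (math.exp(logs[0] - m) + math.exp(logs[-1] - m))
    return m + math.log(total * step), m
```

When K = N - 1 the integrand is $\Phi(x)^N \phi(x)$, whose integral is exactly $1/(N+1)$, which gives a simple check of the numerics.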

To show (3.8), let $x_0 := \sqrt{(2N/(K+1)) \log(K+1)}$. Of course

\[ M \ge \log I_{N,K}(x_0) = -\log\sqrt{2\pi} + \frac{N}{K+1} \Big[ (K+1) \log \Phi\big( \sqrt{2\log(K+1)} \big) - \log(K+1) \Big] = -\frac{N}{K} \Big[ \log K + \frac{1 + o(1)}{2\sqrt{\pi \log K}} + o(1) \Big], \]

where we have used the estimate (3.1) from Lemma 3.1 on $\log \Phi$ and where
the last $o(1)$ accounts for $-\log\sqrt{2\pi}$. This shows one inequality in
(3.8). For an upper bound on M, suppose first that
$x \ge \sqrt{2N/(K+1)} \times \sqrt{\log(K+1) - 2\log\log(K+1)}$. Then

\[ \log I_{N,K}(x) \le -\frac{x^2}{2} \le -\frac{N}{K} \big( \log K + O(\log\log K) \big) \]

as needed. On the other hand, when
$x < \sqrt{2N/(K+1)} \times \sqrt{\log(K+1) - 2\log\log(K+1)}$, then

\[ \log I_{N,K}(x) \le -\log\sqrt{2\pi} + N \log \Phi\Big( \sqrt{2(\log(K+1) - 2\log\log(K+1))} \Big) = (-1 + o(1))\, N\, \frac{1}{\sqrt{2\pi}}\, \frac{(\log K)^2}{K \sqrt{2\log K - 4\log\log K}} \le -(1 + o(1))\, \frac{N}{2\sqrt{\pi}\, K}\, (\log K)^{3/2}, \]

so these values of x need not be considered and the other inequality in (3.8)
is proved.

Proving (3.7) is merely a matter of estimating the second derivative of
$\log I$. By log concavity of $\Phi$, this is at most the second derivative
of $\log \phi$, which is equal to -1 and hence at most -1/2. Let
$x_M := x_M(N,K)$ be such that $\log I_{N,K}(x_M) = M$. Now an easy calculus
argument (using log concavity) shows

\[ I(x) \le \exp\{\log I(x_M)\} \exp\{-(x - x_M)^2/4\} = e^{M} \exp\{-(x - x_M)^2/4\}, \]

so that $\int e^{-M} I_{N,K}(x)\, dx$ is bounded above by a constant
$2\int_0^\infty \exp(-x^2/4)\, dx$ that is independent of N and K, which shows
that $\log \int I(x)\, dx \le M + O(1)$ and finishes the proof of (3.7) and the
first inequality of Theorem 2.1.

Lower bound on R. Let $G_1$ be the event that

\[ t \ge x_1 := \sqrt{ \frac{2N}{K+1} \Big( \log(K+1) + 3\sqrt{\log(K+1)} \Big) }. \]

Let $G_2$ be the event that $\max_j |X_j| \le 1$. Due to independence of t from
$\{X_j : 1 \le j \le N\}$, we may write

\[ p(N,K) \ge \mathbf{P}(G_1 \cap G_2)\, \mathbf{P}(\mathcal{H} \mid G_1, G_2) = \mathbf{P}(G_1)\, \mathbf{P}(G_2)\, \mathbf{P}(\mathcal{H} \mid G_1, G_2). \]

We estimate this in pieces, the first being the one responsible for pushing R
down to $-c\sqrt{\log K}$.

Since $\log \phi(x) = -x^2/2 + O(1)$, we may estimate

\[ \begin{aligned}
\log \mathbf{P}(G_1) &= \log(1 - \Phi(x_1)) \\
&= \log\big( (1 + o(1))\, \phi(x_1)/x_1 \big) \\
&= O(1) - \frac{N}{K+1} \Big( \log(K+1) + 3\sqrt{\log(K+1)} \Big) - \log x_1 \\
&= -\frac{N}{K} \big( \log K + O(\sqrt{\log K}) \big).
\end{aligned} \]

Next, we estimate $\mathbf{P}(G_2)$.

Lemma 3.2. We have

\[ \log \mathbf{P}(G_2) \ge -\frac{\pi^2 N}{2K}. \]

Proof. Let $S_j := \sum_{i=1}^{j} (Y_i - t/\sqrt{N})$ be the recentered partial
sums. Then $X_j = (K+1)^{-1/2} (S_{j+K} - S_{j-1})$, with indices still taken
modulo N. The event $G_2'$, defined by

\[ G_2' := \{ |S_j| \le \tfrac{1}{2}\sqrt{K} \text{ for all } j \le N \}, \]

implies the event $G_2$. Let $W_0$ be Wiener measure on continuous paths
$\omega$ on [0, N] starting at 0 and let $W^{br}$ be the Brownian bridge
measure, that is, $W_0$ conditioned on $\{\omega(N) = 0\}$. The law of
$\{S_j : 1 \le j \le N\}$ is the law of partial sums of N i.i.d. standard
normals conditioned on summing to zero; this is the same as the conditional
law of $\{\omega(j) : 1 \le j \le N\}$ under $W_0$, conditioned on
$\{\omega(N) = 0\}$, which is the same as the law of
$\{\omega(j) : 1 \le j \le N\}$ under $W^{br}$.

A Brownian bridge always stays closer to the origin than unconstrained
Brownian motion, in the following sense: the reflected bridge and the
reflected motion can be coupled so that the former stays below the latter at
all times. In fact, it is not difficult to couple the path of the reflected
simple random walk bridge (i.e., the absolute value of the random walk path
conditioned to visit 0 at time 2n) and the path of the reflected simple random
walk up to step 2n so that the former stays below the latter at all times with
probability 1. Taking the diffusion limits in an appropriate way constructs
one coupling of the reflected Brownian bridge and reflected Brownian motion
described above.

Letting $G_2''$ be the event that $|\omega(t)| \le \sqrt{K}/2$ for all
$t \le N$, we then have

\[ \mathbf{P}(G_2) \ge \mathbf{P}(G_2') \ge W^{br}(G_2'') \ge W_0(G_2''). \]

Let $W_\mu$ be Wiener measure started from distribution $\mu$. Clearly
$W_\mu(G_2'')$ is maximized when $\mu = \delta_0$; that is,
$W_0(G_2'') \ge W_\mu(G_2'')$ for any $\mu$. Now let $\mu$ be the distribution
on $[-\sqrt{K}/2, \sqrt{K}/2]$ with density $C\cos(\pi x/\sqrt{K})$. This
is an eigendensity for Brownian motion killed on exiting
$[-\sqrt{K}/2, \sqrt{K}/2]$ (see [8], Theorem 4.1.1). We see that

\[ \mathbf{P}(G_2) \ge W_\mu(G_2'') = \exp\Big( -\frac{\pi^2 N}{2K} \Big), \]

proving the lemma. □



Finally, we estimate the third term. Recall from (3.5) the formula for the
probability of LFM:

\[ p(N,K) = \mathbf{E} \exp\Big( \sum_{j=1}^{N} \log \Phi\Big( X_j + t\sqrt{\tfrac{K+1}{N}} \Big) \Big). \]

For $x \ge 1$, consider the inequality

\[ \sqrt{2(x + 3\sqrt{x})} \ge \sqrt{2(\sqrt{x} + 1)^2} - \sqrt{2} + 1 = \sqrt{2x} + 1, \]

which can easily be checked, for example, by squaring both sides (note that
if $x \ge 1$, then both sides of the inequality are strictly positive).
Applying this inequality yields, on $G_1 \cap G_2$,

\[ X_j + t\sqrt{\tfrac{K+1}{N}} \ge \sqrt{2\big( \log(K+1) + 3\sqrt{\log(K+1)} \big)} - 1 \ge \sqrt{2\log(K+1)}. \]

Therefore, on $G_1 \cap G_2$ we then have for all j

\[ \log \Phi\Big( X_j + t\sqrt{\tfrac{K+1}{N}} \Big) \ge \log \Phi\big( \sqrt{2\log(K+1)} \big) \]

and hence, using (3.5),

\[ \mathbf{P}(\mathcal{H} \mid G_1, G_2) \ge \Big( \Phi\big( \sqrt{2\log(K+1)} \big) \Big)^{N} \ge \exp\Big( -\frac{N}{K} \Big). \]

Plugging in the estimates for $\mathbf{P}(G_1)$ and $\mathbf{P}(G_2)$ then yields

\[ \log p(N,K) \ge -\frac{N}{K} \Big( \frac{\pi^2}{2} + \log K + O(\sqrt{\log K}) \Big), \]

which finishes the proof of the theorem.



4. Proof of universality results. 



Proof of Theorem 2.2. First inequality. For the moment let the small
positive real parameter y be unspecified. Break the interval from 1 to N into
$L := \lfloor N/((1+y)(K+1)) \rfloor$ intervals of length
$\lfloor (1+y)(K+1) \rfloor$, discarding any unused positions at the end.
Denote these intervals $I_1, \ldots, I_L$ and let $I_j'$ denote the first
$\lceil y(K+1) \rceil$ positions in $I_j$. Let $s_j$ denote the index
$s \in I_j'$ that maximizes $S := \sum_{i=0}^{K} Y_{s,i}$. The maximum is a
maximum of $y(K+1)$ independent draws from $F^{(K+1)}$, so
$B_j := F^{(K+1)}\big( \sum_{i=0}^{K} Y_{s_j,i} \big)$ has the beta
distribution $\beta(y(K+1), 1)$, that is, the law of the maximum of $y(K+1)$
independent uniforms on [0, 1]. The mean of $B_j$ is
$1 - (y(K+1) + 1)^{-1}$. For the event $\mathcal{H}_{s_j}$ to occur, the sum
$\sum_{i=s_j-K}^{s_j} Y_i$ must exceed S. Let
$\mathcal{F}' = \sigma(Y_{j,i} : 1 \le j \le N, 0 \le i \le K)$ be the
$\sigma$-field generated by the fitnesses of substrings with exactly one 1.
Then

\[ \mathbf{P}(\mathcal{H}_{s_j} \mid \mathcal{F}') = 1 - B_j. \]

Since $|s_j - s_k| > K$ when $j \ne k$, the events $\mathcal{H}_{s_j}$ are
conditionally independent given $\mathcal{F}'$, and the $B_j$'s are mutually
independent random variables. Therefore,

\[ \mathbf{P}(\mathcal{H}) \le \mathbf{P}\Big( \bigcap_{j=1}^{L} \mathcal{H}_{s_j} \Big) = \mathbf{E}\, \mathbf{P}\Big( \bigcap_{j=1}^{L} \mathcal{H}_{s_j} \,\Big|\, \mathcal{F}' \Big) = \mathbf{E} \prod_{j=1}^{L} (1 - B_j) = \Big( \frac{1}{1 + y(K+1)} \Big)^{L}. \]

When $K = o(N)$, we choose $y = y(K) = o(1)$ to optimize this bound. For
example, taking $y = 1/\log K$ gives an upper bound of
$\exp(-(1 + o(1)) (N/K) \log K)$, as is required to prove (2.3).

When $K = \Theta(N)$, the same choice of y leads to the same conclusion,
except that one has $\lfloor N/((1+y)(K+1)) \rfloor$ in place of N/K. Since
$y(K) = o(1)$, this is again sufficient to prove (2.3).
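
To see how the block argument behaves at finite size, one can evaluate the logarithm of the bound $(1 + y(K+1))^{-L}$ with $L = \lfloor N/((1+y)(K+1)) \rfloor$ and the choice $y = 1/\log K$ suggested above. The sketch below does exactly that; the function name and its default argument are assumptions for illustration, and it requires $K \ge 2$ so that $\log K > 0$.

```python
import math


def log_block_bound(N, K, y=None):
    """log of (1 + y(K+1))^(-L) with L = floor(N / ((1+y)(K+1))).

    By default y = 1/log K, the choice made in the text (needs K >= 2).
    """
    if y is None:
        y = 1.0 / math.log(K)
    L = math.floor(N / ((1.0 + y) * (K + 1)))
    return -L * math.log1p(y * (K + 1))
```

For example, with N = 1000 and K = 10 one gets L = 63 blocks, and the resulting bound is noticeably weaker than the leading term $-(N/K)\log K$, which reflects how slowly the $1 + o(1)$ factor converges.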

Second inequality. To prove (2.4), begin with the observation that the
events $\mathcal{H}_j$ are increasing events with respect to the variables
$\{Y_j : 1 \le j \le N\}$ and $\{-Y_{j,i} : 1 \le j \le N, 0 \le i \le K\}$. By
Harris' inequality, these are positively associated. Let
$L = \lceil N/(K+1) \rceil$ and, for $1 \le j \le L$, let

\[ G_j := \bigcap_{i=(j-1)(K+1)+1}^{j(K+1)} \mathcal{H}_i. \]

Positive association implies that

\[ \mathbf{P}(\mathcal{H}) = \mathbf{P}\Big( \bigcap_{j=1}^{L} G_j \Big) \ge \mathbf{P}(G_1)^{L}. \]

Thus it suffices to establish

\[ \log \mathbf{P}(G_1) \ge -(3 + o(1)) \log K. \tag{4.1} \]

Let $a_l := F^{(K+1)}\big( \sum_{i=l}^{l+K} Y_i \big)$ for each
$l \in [1, K+1]$. Then

\[ \mathbf{P}(G_1) = \mathbf{E}\, \mathbf{P}(G_1 \mid \mathcal{F}) = \mathbf{E} \prod_{l=1}^{K+1} a_l. \]

If $a_l \ge 1 - 1/K$ for each $l \in [1, K+1]$, then
$\prod_{l=1}^{K+1} a_l \ge e^{-1 + o(1)}$, so (4.1) follows from

\[ \mathbf{P}\Big( \min\{a_l : 1 \le l \le K+1\} \ge 1 - \frac{1}{K} \Big) \ge cK^{-3}. \tag{4.2} \]

Let $\mathcal{F}^*$ be the $\sigma$-field generated by the unordered pair of
sets $\{Y_1, \ldots, Y_{K+1}\}$ and $\{Y_{K+2}, \ldots, Y_{2K+2}\}$. Then
$\min\{a_1, a_{K+2}\} \in \mathcal{F}^*$. Furthermore, conditional on
$\mathcal{F}^*$, the collection
$\{S_l := \sum_{i=1}^{l} (Y_{i+K+1} - Y_i) : 1 \le l \le K+1\}$ has
exchangeable increments (generated by continuous distribution i.i.d. picks, so
ties in the partial sum sequence $S_\cdot$ happen with probability 0) that are
symmetric about 0. Now note the following consequence of exchangeability:
Conditioned on all the increments, if their total sum is positive, then the
probability that the minimum occurs at the beginning, that is, all the
intermediate sums are positive, is at least 1/K. Namely, all cyclic
permutations of the increments are equally distributed and almost surely there
is at least one such permutation for which the minimum is achieved at step 0.
Therefore,

\[ \mathbf{P}(\min\{S_l : 1 \le l \le K+1\} > 0) \ge \tfrac{1}{2} K^{-1}. \]
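
The cyclic-permutation step can be checked on explicit numbers. The sketch below counts, for a given list of increments, how many cyclic rotations have all partial sums strictly positive; when the total is positive and there are no ties, at least one rotation qualifies (start just after the position where the running sum attains its minimum). The function name is an assumption made for this illustration.

```python
def rotations_with_positive_partial_sums(increments):
    """Count cyclic rotations whose partial sums are all strictly positive."""
    n = len(increments)
    count = 0
    for shift in range(n):
        partial = 0.0
        ok = True
        for k in range(n):
            partial += increments[(shift + k) % n]
            if partial <= 0:
                ok = False
                break
        count += ok
    return count
```

For instance, the increments `[3, -1, -1, 2, -2]` have total 1, and only the unshifted order keeps every partial sum positive, so the count is 1.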

When $\{\min\{S_l : 1 \le l \le K+1\} > 0\}$ occurs, we have
$\min\{a_l : 1 \le l \le K+1\} = a_1$. Hence, by conditioning on
$\mathcal{F}^*$ first, the probability on the left-hand side of (4.2) is at
least

\[ \tfrac{1}{2} K^{-1}\, \mathbf{P}(\min\{a_1, a_{K+2}\} \ge 1 - 1/K), \]

and by independence of $a_1$ and $a_{K+2}$ (recall that N > 2K + 1) this is
equal to

\[ (1 + o(1))\, \tfrac{1}{2} K^{-1} K^{-2}, \]

proving (2.4). □

The proofs of Theorems 2.3 and 2.4 are similar to the argument used to 
prove the second inequality of Theorem 2.2. Having specific distributions to 
work with makes the arguments simpler and the results sharper (cf. Conjec- 
ture 1). 

Proof of Theorem 2.3. Cover the interval $[N] := \{1, \ldots, N\}$ with
$L := \lceil N/((1-y)(1+K)) \rceil$ intervals of size
$\lceil (1-y)(K+1) \rceil$. Denote these intervals by $I_1, \ldots, I_L$.
Positive association again implies that

\[ \mathbf{P}(\mathcal{H}') \ge \mathbf{P}(\mathcal{H}_j' \ \forall j \in I_1)^{L}. \]

Let $I'$ denote the interval of length $\lfloor y(K+1) \rfloor$ adjacent to
and just preceding $I_1$. If the maximum of the collection
$\{Y_j, Y_{l,i} : j \in I', l \in I_1, 0 \le i \le K\}$ is $Y_{j_0}$ for some
$j_0 \in I'$, it follows that $\mathcal{H}_j'$ occurs for each $j \in I_1$.
The last claim follows directly from definition (2.5) since for such $j_0$ we
have $Y_{j,i} < \max_{j-K \le l \le j} Y_l$ whenever $j \in I_1$.
The probability of

\[ \Big\{ \max_{j \in I'} Y_j > \max_{l \in I_1,\, 0 \le i \le K} Y_{l,i} \Big\}, \]

up to corrections for integer roundoff, is clearly equal to
$y(K+1)/[(1-y)(K+1)^2 + y(K+1)]$. Thus

\[ \mathbf{P}(\mathcal{H}') \ge \Big( \frac{y}{(1-y)(K+1) + y} \Big)^{L}. \]

Choosing $y = y(K) = 1/\log K$ as before suffices to prove the theorem. □

Proof of Theorem 2.4. Keeping the notation from the previous proof, 
we need to estimate ¥{TLj\/j E I{) when F is the Cauchy distribution. Define 
events: 



(i) A 

(ii) B 

(iii) C 



{max jeI tYj > 2(K + l) 2 }; 

{E je / lU /„ov(-y J )<(K + i) 2 } ; 

{max ie/l E^ ^<(^ + l) 2 }- 



Here Iq is the interval of length K preceding I\ so that for y(K) < 1 (which 
will be the case) I' C 7 - Note that AnBnCc Hjeh W i since onAnBnC 
we have, if j • E Ji, both 

r^y J >maxy i + X! (ov(-y J ))>(A'+i) 2 | 

I i=j 3 jehulo ) 



and 



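It is not difficult to check that

$$\mathbb{P}(A) = ((2\pi)^{-1} + o(1))\, y K^{-1}, \qquad \mathbb{P}(B) \ge \exp\!\Bigl(-\frac{(2-y)^2}{\pi}\Bigr) + o(1), \qquad \mathbb{P}(C) = \exp\!\Bigl(-\frac{1-y}{\pi}\Bigr) + o(1).$$

Indeed,

$$\mathbb{P}(A) = 1 - \mathbb{P}\Bigl(\max_{j \in I'} Y_j \le 2(K+1)^2\Bigr) = 1 - \Bigl(1 - \frac{1/\pi + o(1)}{2(K+1)^2}\Bigr)^{y(K+1)} = \frac{y}{K+1}\Bigl(\frac{1}{2\pi} + o(1)\Bigr),$$

$$\mathbb{P}(B) \ge \mathbb{P}\Bigl(\max_{j \in I_1 \cup I_0} (0 \vee (-Y_j)) < \frac{K+1}{2-y}\Bigr) = \Bigl(1 - \frac{(1/\pi + o(1))(2-y)}{K+1}\Bigr)^{(2-y)(K+1)} = \exp\!\Bigl(-\frac{(2-y)^2}{\pi}\Bigr) + o(1)$$

and

$$\mathbb{P}(C) = \mathbb{P}\Bigl(\sum_{i=0}^{K} Y_{1,i} < (K+1)^2\Bigr)^{(1-y)(K+1)} = \Bigl(1 - \frac{1/\pi + o(1)}{K+1}\Bigr)^{(1-y)(K+1)} = \exp\!\Bigl(-\frac{1-y}{\pi}\Bigr) + o(1).$$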



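Another application of positive association shows that

$$\mathbb{P}(A \cap B \cap C) \ge \mathbb{P}(A)\,\mathbb{P}(B)\,\mathbb{P}(C) \ge \Bigl(\frac{e^{-5/\pi}}{2\pi} + o(1)\Bigr)\frac{y}{K},$$

so that

$$\mathbb{P}(\mathcal{H}) \ge \Bigl[\Bigl(\frac{e^{-5/\pi}}{2\pi} + o(1)\Bigr)\frac{y}{K}\Bigr]^{L}$$

and taking the logarithm, with $y(K) = \log(K)^{-1}$, completes the proof. □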

5. The fat tail when N/K remains bounded. This section provides a
proof of Theorem 2.5. In particular, in this section we derive asymptotic
formulae for $p_{\mathrm{fat}}(N, K)$ that are valid as $N, K \to \infty$, uniformly as long as
$N/K$ remains bounded. Probability estimates come from the following algorithm
for checking whether $\mathcal{H}'$ has occurred.

1. Initialize $r = 1$ and $C$ to be the collection of variables $\{Y_j, Y_{j,i} : 1 \le j \le N,\ 0 \le i \le K\}$.

2. Find the maximum of the variables in $C$.

3. (a) If this maximum is one of the variables $Y_{j,i}$, then output FALSE
and stop.

(b) Else, let $j_r$ be the index such that the maximum occurred at $Y_{j_r}$.

4. Remove from $C$ the variables $Y_{j,i}$ for $j_r \le j \le j_r + K$, $0 \le i \le K$ (these
are no longer relevant since no matter what their value is, we know that
$\bigcap_{j=j_r}^{j_r+K} \mathcal{H}'_j$ has occurred, and other $\mathcal{H}'_j$'s do not depend on the values of
$Y_{j,i}$, $j_r \le j \le j_r + K$, $0 \le i \le K$, anyhow), and also remove the variable $Y_{j_r}$.

5. (a) If the collection $C$ contains no more variables $Y_{j,i}$, then output
TRUE and stop.

(b) Else, set $r$ to $r + 1$ and go to Step 2.
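The five steps can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: sites are 0-based and taken modulo $N$ (the circle), the event $\mathcal{H}'_j$ is read as "the largest of $Y_{j-K},\ldots,Y_j$ exceeds every $Y_{j,i}$", and the function names `run_covering_algorithm`, `direct_check` and `sample` are hypothetical, not taken from the paper.

```python
import random


def run_covering_algorithm(Y, Z, K):
    """Run the five-step check on one sample.

    Y[j] plays the role of Y_j and Z[j][i] the role of Y_{j,i}
    (0-based sites, indices modulo N: an assumption of this sketch).
    Returns (occurred, witnesses), where witnesses lists j_1, j_2, ...
    """
    N = len(Y)
    sites_left = set(range(N))   # sites j whose Y_j is still in C
    uncovered = set(range(N))    # sites j whose Y_{j,i} are still in C
    witnesses = []
    while uncovered:                                   # empty set: Step 5(a)
        j_r = max(sites_left, key=lambda j: Y[j])      # largest remaining Y_j
        top_Z = max(max(Z[j]) for j in uncovered)      # largest remaining Y_{j,i}
        if top_Z > Y[j_r]:                             # Step 3(a)
            return False, witnesses
        witnesses.append(j_r)                          # Step 3(b)
        sites_left.discard(j_r)                        # Step 4: drop Y_{j_r}
        for d in range(K + 1):                         # Step 4: drop Y_{j,i}
            uncovered.discard((j_r + d) % N)
    return True, witnesses


def direct_check(Y, Z, K):
    """Evaluate the intersection of the events H'_j directly, for comparison."""
    N = len(Y)
    return all(
        max(Y[(j - d) % N] for d in range(K + 1)) > max(Z[j])
        for j in range(N)
    )


def sample(N, K, rng):
    """Draw i.i.d. uniform variables; only their relative order matters."""
    Y = [rng.random() for _ in range(N)]
    Z = [[rng.random() for _ in range(K + 1)] for _ in range(N)]
    return Y, Z
```

Comparing `run_covering_algorithm` with `direct_check` on random samples is a quick way to see that the algorithm stops at TRUE exactly when every $\mathcal{H}'_j$ holds.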

Clearly $\mathcal{H}' = \{\text{algorithm stops at TRUE}\}$. We may think of the output
as containing all values of $j_r$ found before stopping, so that in addition to
the indicator function of the event $\mathcal{H}'$, the algorithm outputs the random
variables $R, j_1, \ldots, j_R$, where $R$ is the maximum value of $r$ for which
Step 3(b) (the else statement) is executed. Recall that $r_0 := \lceil N/(K+1) \rceil$
is a lower bound for $R$, provided the output is TRUE. The possible values
for the sequence $\mathbf{j}$ when it is of length $R = r$ are precisely the set $S(r)$ of
sequences that satisfy both of the following statements:

(*) For every $i \in [N]$ there is an $s \le r$ for which $0 \le i - j_s \le K$.
(**) No proper initial segment of $\mathbf{j}$ satisfies property (*).
[Figure: shaded intervals of length $K + 1$; $N = (r - y)(K + 1)$.]

Fig. 1.



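Letting $\mathcal{H}(\mathbf{j})$ denote the event that $\mathcal{H}'$ occurs and the algorithm outputs
the witnessing sequence $\mathbf{j}$, we may decompose $\mathcal{H}'$ into a disjoint union by
setting $\mathcal{H}(r) := \bigcup_{\mathbf{j} \in S(r)} \mathcal{H}(\mathbf{j})$ and

$$\mathcal{H}' = \bigcup_{r} \mathcal{H}(r) = \bigcup_{r} \bigcup_{\mathbf{j} \in S(r)} \mathcal{H}(\mathbf{j}).$$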

Given $1 \le s \le r + 1$ and any sequence $\mathbf{j}$ of length $r$ containing distinct
elements of $[N]$, define

$$\mathrm{missed}(s, \mathbf{j}) := \{ j \in [N] : j - j_t \notin \{0, \ldots, K\}\ \forall t < s \}, \qquad M(s, \mathbf{j}) := |\mathrm{missed}(s, \mathbf{j})|.$$

Vacuously, $M(1, \mathbf{j}) = N$ for all $\mathbf{j}$. Figure 1 illustrates this definition when
$r_0 = 4$. In the illustration, the intervals $[j_s, \ldots, j_s + K]$ are shaded, $j_1$ is equal
to $K + 1$, one interval overlaps with $[1, K + 1]$ modulo $N$ and the other two
intervals also overlap. Figure 1 also illustrates a general fact, namely that
the set $\mathrm{missed}(s, \mathbf{j})$ (the white space between the shaded intervals) is always
composed of no more than $s$ intervals (i.e., the unshaded set has at most $s$
connected pieces), where adjacent white intervals are separated by a distance
of at least $K + 1$.

One further observation is that for all $s$ and $\mathbf{j}$,

$$N \ge M(s, \mathbf{j}) \ge N - (s-1)(K+1). \tag{5.1}$$

Conditional on the event $R \ge r + 1$ and on $j_1, \ldots, j_r$, the values of the variables
remaining in $C$ at stage $r + 1$ are i.i.d., so the conditional probability of
$j_{r+1} = j$ for any $j \notin \{j_1, \ldots, j_r\}$ is equal to the reciprocal of the number of
variables remaining in $C$, that is, $1/(N - r + (K+1)M(r+1, \mathbf{j}))$. Applying this
inductively yields, for $\mathbf{j} \in S(r)$,

$$\mathbb{P}(\mathcal{H}(\mathbf{j})) = \prod_{s=1}^{r} \frac{1}{N - (s-1) + (K+1)M(s, \mathbf{j})}. \tag{5.2}$$
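As a check on the product formula (5.2), the following Python sketch evaluates it exactly for small $N$ and $K$ and sums it over all witnessing sequences. It assumes 1-based sites taken modulo $N$ and reads properties (*) and (**) as "the intervals cover $[N]$, and the first $r - 1$ of them do not"; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def missed(s, seq, N, K):
    """Sites of {1..N} not covered by the cyclic intervals [j_t, j_t + K], t < s."""
    covered = {(jt - 1 + d) % N for jt in seq[:s - 1] for d in range(K + 1)}
    return {i for i in range(1, N + 1) if (i - 1) not in covered}


def is_witness(seq, N, K):
    """Properties (*) and (**): seq covers [N], its first r - 1 entries do not."""
    r = len(seq)
    return not missed(r + 1, seq, N, K) and bool(missed(r, seq, N, K))


def prob_witness(seq, N, K):
    """Right-hand side of (5.2) for the sequence seq, as an exact fraction."""
    p = Fraction(1)
    for s in range(1, len(seq) + 1):
        p /= N - (s - 1) + (K + 1) * len(missed(s, seq, N, K))
    return p


def prob_by_length(N, K):
    """Return {r: P(H(r))}, summing (5.2) over witnessing sequences of length r."""
    out = {}
    for r in range(1, N + 1):
        total = sum(
            (prob_witness(seq, N, K)
             for seq in permutations(range(1, N + 1), r)
             if is_witness(seq, N, K)),
            Fraction(0),
        )
        if total:
            out[r] = total
    return out
```

For example, with $N = 3$ and $K = 1$ every witnessing sequence has length 2 and probability $\frac{1}{9}\cdot\frac{1}{4}$, and there are six of them, so `prob_by_length(3, 1)` returns `{2: Fraction(1, 6)}`. This agrees with the direct argument that the overall maximum must be some $Y_j$ (probability $3/9$) and the maximum of the four remaining relevant variables must again be a $Y_j$ (probability $2/4$).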

The $(K+1)M(s, \mathbf{j})$ contribution above comes from the number of $Y_{j,i}$
variables that are still in $C$. The above computation can be generalized in
the following useful way. Define the event $\mathcal{H}^*(\mathbf{j})$ by

$$\mathcal{H}^*(\mathbf{j}) := \mathcal{H}' \cap \{\mathbf{j} \text{ is an initial segment of the output of the algorithm}\}.$$

When $\mathbf{j}$ of length $r$ is an element of $S(r)$, $\mathcal{H}(\mathbf{j}) = \mathcal{H}^*(\mathbf{j})$; otherwise $\mathcal{H}(\mathbf{j})$
is empty and the right-hand side in (5.2) computes the probability of outputting
$\mathbf{j}$ as an initial segment. To obtain $\mathbb{P}(\mathcal{H}^*(\mathbf{j}))$ from this, one must multiply
the right-hand side in (5.2) by the probability $Q(\mathbf{j})$ that, conditional
on the initial segment being $\mathbf{j}$, the algorithm eventually outputs TRUE.
We compute an upper bound on $Q(\mathbf{j})$, for $\mathbf{j}$ of length $r$, as follows. For
each interval $I = [a, b] \subset \mathrm{missed}(r+1, \mathbf{j})$, for $\mathcal{H}^*(\mathbf{j})$ to happen, it is necessary
that $\max_{a-K \le j \le b} Y_j$ be greater than $\max_{j \in I,\ 0 \le i \le K} Y_{j,i}$. The probability of
$\{\max_{a-K \le j \le b} Y_j > \max_{j \in I,\ 0 \le i \le K} Y_{j,i}\}$ equals

$$\frac{b + K + 1 - a}{b + K + 1 - a + (b + 1 - a)K} = \frac{(b + 1 - a) + K}{b + 1 - a + (b + 2 - a)K} \le \frac{1}{K+1} + \frac{1}{b + 2 - a}.$$

If $\mathrm{missed}(r+1, \mathbf{j})$ is composed of more than one interval, the probabilities for
each interval are multiplied (since they are at least $K + 1$ units apart, everything
is independent) and, therefore, for a given $M(r+1, \mathbf{j})$, the upper bound
on $Q(\mathbf{j})$ is greatest when missed has only one interval and we may take as
an upper bound

$$Q(\mathbf{j}) \le \frac{1}{K+1} + \frac{1}{M(r+1, \mathbf{j}) + 1}. \tag{5.3}$$

We now bound the number of sequences $\mathbf{j}$ that produce a given value of
$M(r+1, \mathbf{j})$.

Lemma 5.1. Let $N = (r - y)(K + 1)$. Then the number of sequences $\mathbf{j}$
of length $r$ with $M(r + 1, \mathbf{j}) = \ell$ is at most

$$N\,C(r)\,(yK + \ell)^{r-2}.$$

Proof. By symmetry, it suffices to consider only sequences for which
$j_1 < \cdots < j_r$ in cyclic order modulo $N$ and then multiply by $(r-1)!$. By
convention, we let $j_0 := j_r - N$. For $1 \le s \le r$, consider the quantities $\Delta_s :=
j_{s-1} + K + 1 - j_s$ to be unknown and satisfying the following two nice properties:

(a) $\sum_{s=1}^{r} \Delta_s = j_0 - j_r + r(K+1) = -N + r(K+1) = y(K+1)$;

(b) $\sum_{s=1}^{r} (-\Delta_s) \vee 0 = \sum_{s=1}^{r} [(j_s - j_{s-1}) - (K+1)] \vee 0 = \ell$.

Property (b) is a consequence of the fact that the length of the unique
(white) interval that contributes to $\mathrm{missed}(r+1, \mathbf{j})$, which is contained in
$[j_{s-1}, j_s]$, equals $[(j_s - j_{s-1}) - (K+1)] \vee 0$. The sequence $(\Delta_1, \ldots, \Delta_r)$ and the
value $j_1$ together determine $\mathbf{j}$. The number of possible sequences $(\Delta_1, \ldots, \Delta_r)$
above may be bounded as follows. Let $S_+$ be the set of indices $i$ for which
$\Delta_i \ge 0$. Given $S_+$, the subsequence $(\Delta_i : i \in S_+)$ is a sequence of nonnegative
integers that sum to $y(K+1) + \ell$. These sequences are called compositions
of $y(K+1) + \ell$ into $|S_+|$ parts, and the number of such compositions is
$\binom{y(K+1) + \ell + |S_+| - 1}{|S_+| - 1}$ ([10], page 14). Similarly, $(-\Delta_i : i \notin S_+)$ is a composition
of $\ell$ into $r - |S_+|$ parts, and the number of these is $\binom{\ell + r - |S_+| - 1}{r - |S_+| - 1}$. We claim
that the product of the above two binomial coefficients is bounded above by
$C_0(r)(y(K+1) + \ell)^{r-2}$. Indeed, the product equals

$$\frac{(y(K+1) + \ell + |S_+| - 1)!}{(|S_+| - 1)!\,(y(K+1) + \ell)!} \cdot \frac{(\ell + r - |S_+| - 1)!}{\ell!\,(r - |S_+| - 1)!}.$$

Clearly $|S_+| \le y(K+1) + \ell$ and $r - |S_+| \le \ell$, which implies

$$\frac{(y(K+1) + \ell + |S_+| - 1)!}{(y(K+1) + \ell)!} \le [2(y(K+1) + \ell)]^{|S_+| - 1}$$

and

$$\frac{(\ell + r - |S_+| - 1)!}{\ell!} \le (2\ell)^{r - |S_+| - 1}.$$

Thus, for a given $S_+$, there are at most $N\,C_0(r)(y(K+1) + \ell)^{r-2}$ such $\mathbf{j}$
sequences ($N$ comes from the choice of $j_1$). Summing over at most $2^r - 2$
values of $S_+$ proves the lemma. □

As mentioned prior to the statement of Theorem 2.6, the complexity in the
behavior of $p_{\mathrm{fat}}(N, K)$ is due to transitions in the number of $Y_j$ variables with
large values from one integer to the next higher. We separate the argument
into several cases, the first three being restricted to $r_0 = \lceil N/(K+1) \rceil \ge 3$:

1. $r_0 - 1 + \varepsilon \le N/(K+1) \le r_0 - \varepsilon$;

2. $r_0 - \varepsilon < N/(K+1) \le r_0$;

3. $r_0 - 1 < N/(K+1) < r_0 - 1 + \varepsilon$;

4. $r_0 = 2$.

The analyses of Cases 2 and 3 actually cover Case 1 since one could take
$\varepsilon = 1/2$, but since the argument is easier for values of $N/(K+1)$ not too
close to an integer, we prefer to present this as the first case.

Case 1. We first compute $\mathbb{P}(\mathcal{H}(r_0))$. For each $\mathbf{j} \in S(r_0)$ and each $s \le r_0$,
the expression (5.2) and the bounds (5.1) imply

$$c(\varepsilon) \le K^{2r_0}\,\mathbb{P}(\mathcal{H}(\mathbf{j})) \le C(\varepsilon) \qquad (\text{recall } N \sim r_0 K).$$

Together with the fact that $S(r_0)$ has cardinality $\Theta(K^{r_0})$ (see below for
details), this immediately implies that

$$\mathbb{P}(\mathcal{H}(r_0)) = \Theta(K^{-r_0}).$$

In this case, we claim that $\mathbb{P}(\mathcal{H}(r))$ is maximized at $r = r_0$. With $\mathbb{P}(\mathcal{H}(r_0 - 1))$
trivially being zero, this statement and the theorem follow from a more
precise estimate of $\mathbb{P}(\mathcal{H}(r_0))$ and a bound on $\mathbb{P}(\mathcal{H}^*(r_0 + 1))$.

Let $T$ be the $r_0$-dimensional torus of $r_0$-tuples in $\mathbb{R}/\mathbb{Z}$, with addition
modulo 1 and unit Lebesgue measure $\lambda$. For $y \in [0, r_0]$, define a subset $T(y) =
T(y, r_0) \subset T$ to be the set of $x = (x_1, \ldots, x_{r_0})$ such that for all $z$ there is a
$j \le r_0$ with $x_j - 1/(r_0 - y) \le z \le x_j$. Consider the mapping of $S(r_0)$ into $T$
by

$$\mathbf{j} \mapsto \mathbf{j}/N. \tag{5.4}$$

The set $S(r_0)$ then maps into the set $T(y)$ for $y = r_0 - (N/(K+1))$. In fact,
for any $U \subset T(y)$, the cardinality of the subset of $S(r_0)$ that maps into $U$
under (5.4) is equal to $(1 + o(1))N^{r_0}\lambda(U)$ uniformly in $N/K$ as $N \to \infty$.
Furthermore, for $\mathbf{j} \in S(r_0)$,

$$\mathbb{P}(\mathcal{H}(\mathbf{j})) = (1 + o(1))\,(NK)^{-r_0}\,\eta(\mathbf{j}/N), \tag{5.5}$$

where

$$\eta(x) = \prod_{s=1}^{r_0} \frac{1}{M(s, x)} \tag{5.6}$$

and $M(s, x)$ is the measure of $[0,1] \setminus \bigcup_{t < s} [x_t - K/N, x_t]$. Let $y = r_0 - N/(K+1)$
and note that $y$ equals $j/(K+1)$ when $N = r_0(K+1) - j$ for $j \ge 0$. By
bounded convergence, we then have

$$K^{r_0}\,\mathbb{P}(\mathcal{H}(r_0)) \to f_{r_0}(y) := \int_{T(y \wedge 1;\, r_0)} \eta(x)\, d\lambda(x) \tag{5.7}$$

(note here that since $y \in [\varepsilon, 1 - \varepsilon]$, $y \wedge 1 = y$) as $N \to \infty$, uniformly in $N/K$,
with $f_{r_0}(\cdot)$ bounded, continuous and nondecreasing. This is the $f_r$ term in
the last line of Table 1.

Next, we compute an upper bound for the probability of the event $\mathcal{H}^*(r_0 + 1) := \bigcup\{\mathcal{H}^*(\mathbf{j}) : \mathbf{j} \notin
S(r_0),\ |\mathbf{j}| = r_0\}$ that an output of TRUE requires at least $r_0 + 1$ covering intervals.
Multiplying the right-hand side of (5.2) by $Q(r_0, \mathbf{j}) = Q(\mathbf{j})$, using (5.3)
with $r = r_0$ and using the fact that $M(s, \mathbf{j}) \ge C(\varepsilon)K$ for $s \le r_0$, we see that

r ° 1 

s=lM(r +lj)=s 
where $C$ represents a constant that depends only on $r_0$ and $\varepsilon$, and the sum
is over sequences $\mathbf{j}$ of length $r_0$. By Lemma 5.1, we may further bound this
from above by

nmro + 1)) < e c(3,k + «r- 2 ^ < r + l) 

1 ^ 1 / s \ r °~ 1 / ^ 1 N 

(5-8) <c^E-(y + ^) (Ei - iog(^) ~ M*0 + c 

<C 



8=1 X 7 \s = l 

,/log# 



Together with (5.7), this establishes that

$$K^{r_0}\,\mathbb{P}(\mathcal{H}') \sim f_{r_0}(y). \tag{5.9}$$

When $\varepsilon \le y \le 1 - \varepsilon$, the term containing $f_r$ in the last line of Table 1 dominates
the term containing $f_{r+1}$ since $f_r(\varepsilon) > 0$, so this proves the theorem
in the case $\varepsilon \le y \le 1 - \varepsilon$ and $N/(K+1) > 3$.



Case 2. This is quite similar to the previous case. The part where we
estimated (5.7) goes through unchanged, only now $f_{r_0}$ tends to zero as
$N/(K+1) \to r_0$ and we need to find the asymptotic rate to compare to
the $f_{r_0+1}$ term.

Lemma 5.2. The measure $\lambda(T(y, r_0))$ of $T(y, r_0)$ is asymptotically $y^{r_0-1}/(r_0 - y)^{r_0-1}$ near $y = 0$.



Proof. The set $T(y)$ is invariant under translation of each coordinate
by a constant, so by symmetry the measure is the same as the $(r_0 - 1)$-dimensional
measure of the fiber of $T(y)$, where $x_1 = 0$. By permutation
invariance, this is equal to $(r_0 - 1)!$ times the measure of the subset of
$T(y)$, where $0 = x_1 < x_2 < \cdots < x_{r_0}$. Such a point is in $T(y)$ if and only if the
quantities $x_i + 1/(r_0 - y) - x_{i+1}$, for $1 \le i \le r_0 - 1$, are positive numbers
summing to at most $y/(r_0 - y)$. In fact, the mapping that maps each $x$
in the fiber to the sequence $(x_1 + K/N - x_2, \ldots, x_{r_0-1} + K/N - x_{r_0})$ is an
isometry. The $(r_0 - 1)$-dimensional simplex of positive numbers summing to
at most $y/(r_0 - y)$ has volume $y^{r_0-1}/((r_0 - y)^{r_0-1}(r_0 - 1)!)$, which proves
the lemma. □
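The closed form in Lemma 5.2 can be compared with a simple Monte Carlo estimate. The sketch below is illustrative only: it samples points uniformly on the torus with a fixed seed and tests the covering condition through the cyclic gaps between consecutive sorted coordinates. The function names and the sample size are arbitrary choices, not taken from the paper.

```python
import random


def covers_circle(x, arc):
    """True if the arcs [x_j - arc, x_j] (mod 1) cover the unit circle."""
    pts = sorted(x)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])     # wrap-around gap
    return max(gaps) <= arc


def estimate_measure(y, r0, samples=200_000, seed=0):
    """Monte Carlo estimate of lambda(T(y, r0)) with a fixed seed."""
    rng = random.Random(seed)
    arc = 1.0 / (r0 - y)
    hits = sum(
        covers_circle([rng.random() for _ in range(r0)], arc)
        for _ in range(samples)
    )
    return hits / samples


def lemma_value(y, r0):
    """The closed form y^(r0-1) / (r0 - y)^(r0-1) from Lemma 5.2."""
    return (y / (r0 - y)) ** (r0 - 1)
```

For instance, with $r_0 = 2$ and $y = 1/2$ the arcs have length $2/3$, the two gaps $g$ and $1 - g$ must both be at most $2/3$, and the measure is $1/3 = y/(2 - y)$; the estimate returned by `estimate_measure(0.5, 2)` should be close to this value.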



As $y \to 0$, the factors $1/M(s, x)$ converge to $r_0/(r_0 - (s - 1))$, since the
only way for a vector to be in $T(y)$ is for it to have $r_0$ approximately evenly
spaced coordinates. Therefore, the function $\eta$ defined in (5.6) converges to
the constant $r_0^{r_0}/r_0!$ on $T(y)$, and we have

$$f_{r_0}(y) = \int_{T(y)} \eta(x)\, d\lambda(x) \sim \frac{r_0^{r_0}}{r_0!}\,\lambda(T(y)) \sim \frac{y^{r_0-1}}{(r_0-1)!}.$$

Since the contribution of $\mathbb{P}(\mathcal{H}(r_0 + 1))$ to $\mathbb{P}(\mathcal{H}')$ is no longer negligible, we
must compute it a little more precisely as well. If we write it as an integral
analogous to (5.7), we find, for $r_0 \ge 3$, that the integral $\int_{T(1)} \eta(x)\, d\lambda(x)$ exists
as an improper integral, but the integral over $T(y)$ diverges for $y > 1$. We
have shown that $K^{r_0}\,\mathbb{P}(\mathcal{H}(r_0)) \sim y^{r_0-1}/(r_0 - 1)!$ as $y \to 0$, and we have an
upper bound (5.8) on $\mathbb{P}(\mathcal{H}^*(r_0 + 1))$. When $y \ge K^{-1/r_0}$, these two together
show that still

$$K^{r_0}\,\mathbb{P}(\mathcal{H}') \sim f_{r_0}(y).$$

Assume therefore that

$$y < K^{-1/r_0}. \tag{5.10}$$

We cannot immediately conclude for $0 < y < K^{-1/r_0}$ that

$$K^{r_0+1}\,\mathbb{P}(\mathcal{H}^*(r_0 + 1)) \to f_{r_0+1}(1)$$

and it is our remaining task to verify the above statement. One part of this
is easy. For any positive $L$, the function $\eta \mathbf{1}_{\eta \le L}$ is bounded and, as $L \to \infty$,
these functions converge in $L^1$ to $\eta$ as long as $\eta \in L^1$, which is the case since
we have assumed that $r_0 \ge 3$. Equivalently, the function

$$g(L) := \int_{T(1;\, r_0+1)} \eta(x)\, \mathbf{1}_{\eta(x) > L}\, d\lambda(x)$$

converges to 0 as $L \to \infty$ and, by bounded convergence, we may approximate
the truncated sum of the terms in (5.5) by a truncated integral as $K \to \infty$:

$$K^{r_0+1}\,\mathbb{P}\bigl(\mathcal{H}(r_0 + 1) \cap \{\eta(\mathbf{j}/N) \le L\}\bigr) \to (1 - g(L))\, f_{r_0+1}(1). \tag{5.11}$$

The theorem, in Case 2, follows if we can show that

$$\mathbb{P}\bigl(\eta(\mathbf{j}/N) \le L,\ \mathcal{H}^*(r_0 + 2)\bigr) \le \frac{C(L)}{K^{r_0+2}} \tag{5.12}$$

and

$$\mathbb{P}\bigl(\eta(\mathbf{j}/N) > L,\ \mathcal{H}^*(r_0 + 1)\bigr) \le \frac{c(L)}{K^{r_0+1}} \tag{5.13}$$

for $c(L) \to 0$ as $L \to \infty$, uniformly in $K$. Indeed if these two hold, then for
$L$ large enough so that $c(L) < \delta/2$ and $K$ then chosen large enough so that
$C(L)/K < \delta/2$, we have

$$K^{r_0+1}\bigl[\mathbb{P}(\mathcal{H}^*(r_0 + 1)) - \mathbb{P}\bigl(\mathcal{H}(r_0 + 1) \cap \{\eta(\mathbf{j}/N) \le L\}\bigr)\bigr] < \delta,$$

which together with (5.11) finishes Case 2.

To prove (5.12), we may use the same argument that proved (5.8), but
with $r_0$ replaced by $r_0 + 1$. We sum over sequences $\mathbf{j}$ of length $r_0 + 1$ to get



i \ \ ro+1 1 

*) * ^ + 2> ) £ ws £ 1) Q(j) 3 N -(s-l) + (K + l)M(s,j) 

N x 1 

<E E ^) 7 ^5- 

s=lM(r +lj)=s 

Here we have used the fact that $\eta(\mathbf{j}/N) \le L$ to bound the product in the
first line by $C(L)K^{-2r_0-2}$; equation (5.3) is valid for any $r$, so there is no
trouble replacing $r_0$ by $r_0 + 1$ here. At the next step, instead of requiring
Lemma 5.1, we require only the trivial bound on the number of sequences
$\mathbf{j}$ of length $r_0 + 1$ with $M(r_0 + 2, \mathbf{j}) = s$, namely $CK^{r_0}$. Following the path
to (5.8) leads this time to (5.12).

To prove (5.13), observe first that $\eta(\mathbf{j}/N) > L$ implies $M(r_0 + 1, \mathbf{j}) \le
\varepsilon(L)K$ for some function $\varepsilon(L)$ going to zero as $L \to \infty$. This follows from
expression (5.2), according to which all the factors $1/M(s, \mathbf{j})$ in the definition
of $\eta$ are bounded above except for the factor with $s = r_0 + 1$, which is
of order $K/M(r_0 + 1, \mathbf{j})$. Hence,

rj(±)>L,H*(r + l) 



e(L)K ro 

< E E Q(r ,m 



s=l j:A/(r +lj)=s t= 



\N-(t-l) + (K + l)M(t,i) 



e(L)K 

< J2 C(r ,e)(N -r K + 2SY - 1 - K~ 2ro 

S = l 

- K r +1 \ K UT K g 

s=l 

This sum is at most twice the integral for which it is an upper Riemann sum.
To be precise, we consider the sum as a step function, change variables to
$x = (s + 1)/K$, and compare the upper and lower Riemann sums to integrals,
concluding that
K ro+i F Lfl \ > L,n*(r + r 



V \N 



<2C(r , £ )/ o (^-n, + 2xj -^d*. 
As a family of functions on $[0, 1]$, the integrands form a uniformly integrable
family as long as $N/K - r_0 < K^{-a}$ for some $a$. By assumption (5.10), this
inequality is indeed satisfied, and we may conclude that the integral from
0 to $\varepsilon(L)$ tends to zero uniformly in $K$ as $\varepsilon(L) \to 0$. This finishes the proof
of (5.13) and therefore of Case 2. We go on to Case 4, coming back to Case 3
later since it uses some of the computations from Case 4.

Case 4. When $r_0 = 2$, the computation is particularly simple without
using the continuous approximation. The first term in the product in (5.2)
is always $1/(N(K + 2))$. By symmetry,

$$\mathbb{P}(\mathcal{H}(2)) = N \sum_{\mathbf{j} \in S(2),\ j_1 = 1} \mathbb{P}(\mathcal{H}(\mathbf{j})).$$

For $j_1 = 1$, so that $\mathbf{j}$ satisfies property (*), it is necessary to choose $N - K \le
j_2 \le K + 2$. Also, if $j_1 = 1$, then $\mathrm{missed}(2, \mathbf{j}) = \{K + 2, \ldots, N\}$ and the second
factor in (5.2) is always $1/(N - 1 + (K + 1)(N - K - 1))$. Thus, letting
$j = 2(K + 1) - N \in \{0, \ldots, K\}$, we have

$$\begin{aligned}
\mathbb{P}(\mathcal{H}(2)) &= N \cdot \frac{1}{N(K+2)} \cdot \frac{2K + 3 - N}{N - 1 + (K+1)(N - K - 1)} \\
&= \frac{1}{K+2} \cdot \frac{j + 1}{N - 1 + (N - K - 1)(K + 1)} \\
&= (1 + o(1))\, \frac{1}{K^2} \cdot \frac{j + 1}{N - (K + 1) + ((N - 1)/(K + 1))} \\
&\sim \frac{1}{K^2} \cdot \frac{j + 1}{K + 3 - j}.
\end{aligned} \tag{5.14}$$



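For $\mathbb{P}(\mathcal{H}(3))$, a similarly direct argument ensues. If $\mathcal{H}(3)$ occurs via $\mathcal{H}(\mathbf{j})$
for some $\mathbf{j} \in S(3)$ with $j_1 = 1$, then since $\mathcal{H}(2)$ does not occur, either $j_2 \in
[2, N - (K + 1)]$ or $j_2 \in [K + 3, N]$. In the former case, $j_3 \in [N - K, j_2 + K + 1]$,
while in the latter case, $j_3 \in [j_2 - K - 1, K + 2]$. For the first of the two cases,
we then have a contribution to $\mathbb{P}(\mathcal{H}(3))$ of

$$\begin{aligned}
& N \cdot \frac{1}{N(K+2)} \sum_{j_2=2}^{N-K-1}\ \sum_{j_3=N-K}^{j_2+K+1} \frac{1}{N - 1 + (N - K - 1)(K + 1)} \cdot \frac{1}{(N - 2) + (N - j_2 - K)(K + 1)} \\
&\quad = \frac{1}{K+2} \cdot \frac{1}{N - 1 + (K + 1 - j)(K + 1)} \sum_{j_2=2}^{N-K-1}\ \sum_{j_3=N-K}^{j_2+K+1} \frac{1}{N - 2 + (K + 1)(N - K - j_2)} \\
&\quad = \frac{1}{K+2} \cdot \frac{1}{N - 1 + (K + 1 - j)(K + 1)} \sum_{j_2=2}^{N-K-1} \frac{j + j_2}{N - 2 + (K + 1)(N - K - j_2)} \\
&\quad = \frac{1 + o(1)}{K^3 (1 - j/K)} \sum_{s=1}^{N-K-2} \frac{1 - s/K}{s} \\
&\quad = (1 + o(1))\, \frac{\log K}{K^3 (1 - j/K)}.
\end{aligned} \tag{5.15}$$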

Here the third equality comes from the substitution $s = N - K - j_2$ and the
definition of $j$ as $2(K + 1) - N$, while the $(1 + o(1))$ term comes from factors
of order $(1 + O(1/K))$ that remain once we remove three factors of $K$ from
the top and bottom of the fraction preceding the sum and one factor of $K$
from the top and bottom of the summand. The computation for the second
case is symmetrical, leading to

$$\mathbb{P}(\mathcal{H}(3)) = (2 + o(1))\, \frac{\log K}{K^3 (1 - j/K)}. \tag{5.16}$$

Comparing (5.16) to (5.14), we see that the former is dominant when
$j = o(\log K)$, the latter when $\log K = o(j)$ and both contribute when $j =
\Theta(\log K)$. In particular, (5.14) contributes only when $j \to \infty$, in which case
the contribution is $(1 + o(1))\, K^{-2} \bigl( \frac{K}{K - j} - 1 \bigr)$, while (5.16) contributes only
when $j = o(K)$, in which case the contribution is $(2 + o(1)) \log K / K^3$. From
these the first line in Table 1 follows as a lower bound, with an identical
upper bound yet to follow if we show that changing $\mathcal{H}(3)$ to $\mathcal{H}^*(4)$ produces
no change to the asymptotics.

The difference between $\mathcal{H}(3)$ and $\mathcal{H}^*(4)$ is that in the latter case, $j_3$ can
be any element of $\mathrm{missed}(3, (j_1, j_2))$. These are all $j'$ not in the interval $[1, j_2]$, so
the numerator $j + j_2$ of (5.15) becomes $N - j_2$. This changes the $1 - s/K$ in
the numerator of the subsequent line to $1 + s/K$, which does not affect the
sum asymptotically since all the contributions come from $s = o(K)$.



Case 3. The analysis of the $\mathbb{P}(\mathcal{H}(r_0 + 1))$ term in Case 2 works just as
well for $N$ slightly greater than $r_0(K + 1)$, and this becomes the $f_r$ term in
the last line of the table for $r = r_0 + 1$. Since $r_0 \ge 3$, Case 2 handles the $f_r$
terms for $r \ge 4$. It remains only to analyze the $f_3$ term appearing in line 2
of the table.

We borrow the analysis from Case 4. Now the event $\mathcal{H}(2)$ cannot happen,
so we need to evaluate $\mathbb{P}(\mathcal{H}(3))$, show it gives the asymptotics stated in the
theorem and then show that adding $\mathbb{P}(\mathcal{H}^*(4))$ does not alter the asymptotics.
Let $N = 2(K + 1) + j$. Assume $j_1 = 1$, so the first interval thrown out of $C$ is
$[1, K + 1]$. To cover in three intervals, the second interval thrown out must
overlap the first or be contiguous to it: otherwise $C$ will be two disjoint
intervals and will have diameter more than $K$, whence one more step will
not suffice to cover it. Again we may consider only the case where the second
interval is contiguous to the right of the first and then double to count the
case where the second is contiguous to the left of the first. The value of $j_2$
cannot be $j$ or less, since this would leave $C$ with cardinality greater than
$K + 1$, which is too large a set to cover in one additional step. Thus, before
doubling, the allowable range for $j_2$ is $[j + 1, K + 2]$. The corresponding range
for $j_3$ is $[N - K, j_2 + K + 1]$. Equation (5.15) now becomes

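$$\begin{aligned}
\frac{\mathbb{P}(\mathcal{H}(3))}{2} &= \frac{1}{K+2} \cdot \frac{1}{N - 1 + (K + 1)(K + 1 + j)} \sum_{j_2=j+1}^{K+2} \frac{j_2 - j}{N - 2 + (K + 1)(K + 2 - j_2 + j)} \\
&= \frac{1 + o(1)}{K^3 (1 + j/K)} \sum_{s=0}^{K+1-j} \frac{1 - j/K - s/K}{j + 1 + j/K + s},
\end{aligned} \tag{5.17}$$

where the sum is bounded when $j/K \in [\varepsilon, 1/2]$ and, as $t := j/K \to 0^+$, the whole expression is asymptotic to

$$\frac{1 + o(1)}{K^3}\, \log \frac{1}{t}$$

due to the comparison of $\sum_{s} \frac{1 - j/K - s/K}{j + 1 + j/K + s}$ with $\sum_{s} \frac{1}{j + 1 + j/K + s}$, which is approximated by

$$\log(K(1 - t) + 2 + Kt) - \log(Kt) = \log\Bigl(\frac{K + 2}{Kt}\Bigr).$$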

Doubling yields, as a lower bound, the expression in the second line of Table 1
for $r = 3$; for the upper bound, it remains to get an upper bound on $\mathbb{P}(\mathcal{H}^*(4))$.

We must sum this time over two types of sequences $(1, j_2, j_3)$. The first
are those with $j_2 \notin [j + 1, K + 2]$; these do not appear in $\mathcal{H}(3)$ because it
is not possible to cover $[N]$ in three intervals starting this way. The second
are sequences where $j_2 \in [j + 1, K + 2]$ but $(1, j_2, j_3) \notin S(3)$; these do not
appear in $\mathcal{H}(3)$ because the third interval did not complete the cover of $[N]$,
where a different choice of $j_3$ could have completed the cover. Analyzing the
second of these two types repeats the analysis from the last paragraph of
Case 4. That is, allowing these values of $j_3$ replaces $1 - j/K - s/K$ in the
numerator of (5.17) by $1 + j/K + s/K$, which does not affect the leading
term when $j = o(K)$ and otherwise multiplies by a bounded factor, which
we absorb into the definition of $f_3$.

The first of the two types of sequences splits into subtypes: $-j < j_2 \le j$
(in which case you do not cover enough new ground to be able to complete
coverage in three steps) or $K + 3 \le j_2 \le K + 1 + j$ [in which case the set
$\mathrm{missed}(3, \mathbf{j})$ splits into two intervals and cannot be covered by one more
interval]. For the first subtype, $M(3, \mathbf{j})$ is always at least $K$, so the sum over
sequences of this subtype is $O(K^{-3})$. For the second subtype, $M(3, \mathbf{j}) = j$.
For each $t$ there is exactly one value of $j_2$ for which $\mathrm{missed}(3, \mathbf{j})$ is composed
of disjoint intervals of sizes $t$ and $j - t$ in that order. Given that this occurs for
some $t$, one may reason as in (5.3) to see that $Q(\mathbf{j}) \le 2/(t(j - t))$. Thus the
total probability of the second subtype is bounded above by

ck-^^- = o(k-^\ 

and since this is negligible compared to $K^{-3} \log(K/j)$, the proof in Case 3
is complete.

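6. The fat tail when K = 1. In this section, we prove Theorem 2.6. For
convenience we add a variable $Y_0$ to get an i.i.d. collection $C := \{Y_0, Y_j, Y_{j,i} : 1 \le
j \le N,\ 0 \le i \le 1\}$ and define the event $\mathcal{H}'_1$ to hold when $Y_0 \vee Y_1 > Y_{1,0} \vee Y_{1,1}$.
Letting $\mathcal{H}^* = \bigcap_{j=1}^{N} \mathcal{H}'_j$, it is evident that

$$\mathbb{P}(\mathcal{H}^*) \ge p_{\mathrm{fat}}(N + 1, 1)$$

by monotonicity of probability, and from Harris' (positive association) inequality
we see that

$$p_{\mathrm{fat}}(N + 2, 1) \ge \mathbb{P}(\mathcal{H}'_{N+2} \cap \mathcal{H}'_{N+1} \cap \cdots \cap \mathcal{H}'_2) \cdot \mathbb{P}(\mathcal{H}'_1) = \mathbb{P}(\mathcal{H}^*) \cdot \mathbb{P}(\mathcal{H}'_1).$$

Since $\mathbb{P}(\mathcal{H}'_1) = c > 0$ independently of $N$, it suffices to prove Theorem 2.6 for
$p_N := \mathbb{P}(\mathcal{H}^*)$ in place of $p_{\mathrm{fat}}(N, 1)$.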

Having sliced open the circle, it is possible to derive a recursion for $p_N$.
Observe that the order of the variables in $C$, namely $\{Y_j, Y_{j,i}, Y_0 : 1 \le j \le
N,\ 0 \le i \le 1\}$, is uniform among the $(3N + 1)!$ permutations, and that the
permutation determines whether $\mathcal{H}^*$ has occurred. For $\mathcal{H}^*$ to occur, it is
necessary that the maximum $M$ of variables in $C$ be $Y_j$ for some $j$. Thus

$$p_N = \sum_{j=0}^{N} \frac{1}{3N + 1}\, \mathbb{P}(\mathcal{H}^* \mid Y_j = M).$$

These conditional probabilities may be evaluated recursively. If $Y_0 = M$,
then further information about $Y_{1,0}$ and $Y_{1,1}$ is irrelevant and the ordering
of the remaining $3N - 2$ variables is uniform, leading to

$$\mathbb{P}(\mathcal{H}^* \mid Y_0 = M) = p_{N-1}.$$

To ensure this holds for $N = 1$, we set $p_0 := 1$. Similarly,

$$\mathbb{P}(\mathcal{H}^* \mid Y_N = M) = p_{N-1}.$$
Now suppose $N \ge 2$ and $Y_j = M$ for some $1 \le j \le N - 1$. Then $\mathcal{H}'_j$ and
$\mathcal{H}'_{j+1}$ are known to occur. Removing from consideration the variables $Y_j$, $Y_{j,i}$ and
$Y_{j+1,i}$ for $i = 0, 1$, the remaining variables are broken into two subsets of size
$3(j - 1) + 1$ and $3(N - j - 1) + 1$; the ordering on the union of these is still
jointly uniform, leading to

$$\mathbb{P}(\mathcal{H}^* \mid Y_j = M) = p_{j-1}\, p_{N-j-1}.$$

This equation is readily verified for $N \ge 2$ and $j = 1$ or $j = N - 1$ as well.
Putting these together gives the recursion

$$p_N = \delta_{0,N} + \frac{1}{3N + 1} \Bigl( 2 p_{N-1} + \sum_{j=1}^{N-1} p_{j-1}\, p_{N-j-1} \Bigr), \tag{6.1}$$

which holds for all $N$ due to the inclusion of the delta function.
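Recursion (6.1) is easy to iterate numerically. The sketch below, with hypothetical function names, computes $p_0, \ldots, p_N$ in exact rational arithmetic and uses the ratio $p_{N-1}/p_N$ as a numerical estimate of the radius of convergence of the power series $\sum_N p_N z^N$. It is meant only as a cross-check under the reading of (6.1) given above (with $p_0 = 1$ supplied by the delta term), not as part of the proof.

```python
from fractions import Fraction
from math import log


def fat_tail_line_probabilities(n_max):
    """Return [p_0, ..., p_{n_max}] from recursion (6.1), exactly."""
    p = [Fraction(1)]                       # p_0 = 1 (the delta term)
    for N in range(1, n_max + 1):
        ends = 2 * p[N - 1]                 # maximum at Y_0 or at Y_N
        interior = sum(p[j - 1] * p[N - j - 1] for j in range(1, N))
        p.append((ends + interior) / (3 * N + 1))
    return p


def growth_rate_estimate(n_max=60):
    """Estimate z_0 as the ratio p_{N-1} / p_N, and the rate -log z_0."""
    p = fat_tail_line_probabilities(n_max)
    z0 = float(p[n_max - 1] / p[n_max])
    return z0, -log(z0)
```

The first values are $p_1 = 1/2$, $p_2 = 2/7$ and $p_3 = 11/70$, and the successive ratios $p_{N-1}/p_N$ settle quickly near 1.803, so `growth_rate_estimate()` gives a decay rate close to $-0.589$.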

Let $f(z) := \sum_{N=0}^{\infty} p_N z^N$. Since we know (by submultiplicativity) that
$\log p_N / N \to \log(\lambda)$ for some $\lambda \in (0, 1)$, the radius of convergence for the
power series defining $f$ above will be $1/\lambda$. The generating function for
$(3N + 1)p_N$ is equal to $f + 3zf'$. The generating function for $\delta_{0,N}$ is 1,
the generating function for $2p_{N-1}$ is $2zf$ and the generating function for
$\sum_{j=2}^{N} p_{j-2}\, p_{N-j}$ is $z^2 f^2$. Equation (6.1) then becomes a Riccati equation:

$$f + 3zf' = 1 + 2zf + z^2 f^2. \tag{6.2}$$

From the derivation it is apparent that this functional equation has a
unique formal power series solution, $f$, and since $|p_N| \le 1$ for all $N$, the series
represents a function, also denoted $f$, that is analytic in a neighborhood
of the origin. Only one locally analytic function can satisfy (6.2). To see
this, write $g(z) = z f(z^3)$ so that $g' = 1 + 2z^2 g + z^4 g^2 := F(z, g)$ with boundary
value $g(0) = 0$. Since $F$ is bounded and Lipschitz in a neighborhood of
the origin, Gronwall's lemma ([5] or implicit in the classical uniqueness result
[1], Theorem 2.2) says there is at most one such $g$ in the set of functions
differentiable near 0.

Thus $f$ is the unique locally analytic solution to (6.2), whence we may use
Maple's ordinary differential equation solver to find solutions to (6.2) and
be rigorously assured that any such solution we can verify by differentiation
must equal $f$. One finds that for any real constant $A$, there is a solution $f_A$
which is a ratio of Bessel functions. Its numerator is equal to

num : = (ABesselI(-|,|v / 2i) + BesselK(i |>/2z)) 
and its denominator is equal to

den:= y/z(-Ay/2 Bessell(|, \y/2z) + A^ BesselI(-±, \\[2~z) 

+ y/2 BesselK(|, fv^i) + yfz BesselK(±, \\FTz))\ 

here BesselI and BesselK denote modified Bessel functions of the first and
second kinds, respectively. It is not yet clear whether one of these solutions
is $f$.

As a fractional power series, $f_A$ has a leading term of $z^{1/3}$, so certainly
if $f_A = f$, then $A$ must be chosen to make this term vanish. Solving for $A$
yields $A = -\pi\sqrt{3}/3$, and plugging this into the expressions for num and
den leads to a function with a power series, a priori fractional, beginning
with $1 + z/2 + \cdots$. The series converges in a neighborhood of the origin, so it
defines a function that is $1 + O(z)$ near $z = 0$. Any function that is $1 + O(z)$
near the origin and satisfies the differential equation (6.2) must be analytic
in a neighborhood of the origin. We have therefore found the function $f$.

Since $f$ has positive coefficients, its minimal modulus singularities lie on
the positive real axis. Its functional form dictates that $f$ has positive real
singularities precisely at the zeros of den. We may approximate these as
closely as we wish. Maple's numeric solver gives $z_0 := 1.803034611\ldots$ (the
constant is not recognized by Plouffe's inverse symbolic calculator). Thus

$$\frac{\log p_N}{N} \to -\log z_0 = -0.58947114,$$

which finishes the proof of Theorem 2.6.